library(tidyverse)
library(fastDummies)

ratings <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-25/ratings.csv')
details <- read_csv('https://raw.githubusercontent.com/rfordatascience/tidytuesday/master/data/2022/2022-01-25/details.csv')

analytic <- full_join(ratings, details, by = "id") %>%
  filter(playingtime <= 300 & # ≤ 300 min
           playingtime > 0 &  # > 0 min
           year > 1900 &      # filter out games without years
           year < 2024 &      # filter out games with too large of years
           minplayers > 0) %>% # require at least 1 player for game
  mutate(play60 = playingtime / 60) %>%
  select(id, name, year, average, play60, minplayers) %>%
  mutate(year2013 = if_else(year >= 2013, 1, 0),
         play_hours = case_when(play60 <= 1 ~ 1,
                                play60 > 1 & play60 <= 2 ~ 2,
                                play60 > 2 & play60 <= 3 ~ 3,
                                play60 > 3 & play60 <= 4 ~ 4,
                                play60 > 4 & play60 <= 5 ~ 5),
         play_home = if_else(minplayers <= 2, 1, 0)) %>%
  dummy_cols(select_columns = "play_hours") %>%
  na.omit()
Example 1 - Model
Let’s model the average rating as a function of whether the game was made in the last 10 years (year2013), whether I can play it at home (play_home), the length of game play (play_hours, categorical!), the interaction between play_home and year2013, and the interaction between play_home and play_hours.
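Written out, the model described above might look like the following. This is a sketch that assumes play_hours = 1 is the reference category (consistent with the indicator columns play_hours_2 through play_hours_5 created by dummy_cols()); the $\beta$ labels and their ordering are notation introduced here for illustration, not taken from fitted output.

$$
\begin{aligned}
\text{average} = {} & \beta_0 + \beta_1\,\text{year2013} + \beta_2\,\text{play\_home} \\
& + \beta_3\,\text{play\_hours\_2} + \beta_4\,\text{play\_hours\_3} + \beta_5\,\text{play\_hours\_4} + \beta_6\,\text{play\_hours\_5} \\
& + \beta_7\,(\text{play\_home} \times \text{year2013}) \\
& + \beta_8\,(\text{play\_home} \times \text{play\_hours\_2}) + \beta_9\,(\text{play\_home} \times \text{play\_hours\_3}) \\
& + \beta_{10}\,(\text{play\_home} \times \text{play\_hours\_4}) + \beta_{11}\,(\text{play\_home} \times \text{play\_hours\_5}) + \varepsilon
\end{aligned}
$$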
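As we see in the model, a categorical $\times$ categorical interaction results in $(c_1-1)(c_2-1)$ terms, where $c_1$ and $c_2$ are the numbers of categories in the two predictors.
In our example, play_home (2 categories) $\times$ play_hours (5 categories) results in $(2-1)(5-1) = 4$ terms.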
If we want to know whether the interaction as a whole is significant, we must perform a partial F test.
The reduced model removes only the terms related to the specific interaction we are interested in.
For example, to determine whether the interaction between play_home and play_hours is significant, we would remove play_home:play_hours_2, play_home:play_hours_3, play_home:play_hours_4, and play_home:play_hours_5.
Note that in the case of binary $\times$ binary or binary $\times$ continuous interactions, the interaction is a single term, so we can use its test from summary() directly.
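The full-versus-reduced comparison described above can be sketched in base R with anova(). To keep the sketch self-contained, it uses a small simulated data frame (sim, a made-up stand-in for analytic with the same column names); the simulated values and any output they produce say nothing about the real board game data.

```r
# Simulated stand-in for `analytic`; all values are made up for illustration.
set.seed(1)
n <- 500
sim <- data.frame(
  year2013   = rbinom(n, 1, 0.5),
  play_home  = rbinom(n, 1, 0.6),
  play_hours = sample(1:5, n, replace = TRUE)
)
# Indicator columns named the way dummy_cols() names them (play_hours_2, ...).
for (k in 2:5) {
  sim[[paste0("play_hours_", k)]] <- as.integer(sim$play_hours == k)
}
sim$average <- 6 + 0.3 * sim$year2013 + 0.2 * sim$play_home +
  0.1 * sim$play_hours + rnorm(n, sd = 0.8)

# Full model: main effects plus both interactions.
full <- lm(average ~ year2013 + play_home +
             play_hours_2 + play_hours_3 + play_hours_4 + play_hours_5 +
             play_home:year2013 +
             play_home:play_hours_2 + play_home:play_hours_3 +
             play_home:play_hours_4 + play_home:play_hours_5,
           data = sim)

# Reduced model: drops only the four play_home x play_hours terms.
reduced <- lm(average ~ year2013 + play_home +
                play_hours_2 + play_hours_3 + play_hours_4 + play_hours_5 +
                play_home:year2013,
              data = sim)

# Partial F test: the numerator df equals the number of terms removed (4).
ftest <- anova(reduced, full)
print(ftest)
```

With the real data, sim would be replaced by analytic. The F statistic and p-value in the second row of the anova() table test whether the four interaction terms are jointly zero.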
Example 1 - Testing
Let’s determine which interactions are significant.
There is sufficient evidence to suggest that the relationship between average game rating and having a minimum player count of 1 or 2 (play_home) depends on whether the game was made in the last 10 years (year2013).
Example 1 - Data Visualization
Example 2 - Model
Let’s now model the average rating as a function of whether the game was made in the last 10 years (year2013), whether I can play it at home (play_home), the length of game play (play60, continuous!), the interaction between play_home and year2013, and the interaction between play_home and play60.
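Written out, this second model might look like the following. The $\beta$ labels are again notation introduced here for illustration. Because play60 is continuous, each interaction contributes a single term.

$$
\begin{aligned}
\text{average} = {} & \beta_0 + \beta_1\,\text{year2013} + \beta_2\,\text{play\_home} + \beta_3\,\text{play60} \\
& + \beta_4\,(\text{play\_home} \times \text{year2013}) + \beta_5\,(\text{play\_home} \times \text{play60}) + \varepsilon
\end{aligned}
$$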